Multiomics analysis of the mechanisms behind flavonoid differences between purple and green tender shoots of Camellia sinensis var. assamica

Abstract Flavonoids are rich in tea plants (Camellia sinensis), and responsible for the flavor and healthful benefits of tea beverage. The anthocyanin levels in the purple tender shoots are higher than in the general green leaves of tea plant, which provide special materials to search metabolic mechanisms of flavonoid enrichment in plant. In this work, flavonoid differences between purple and green shoots from tea cultivars “Zijuan” (ZJ) and “Yunkang10” (YK-10) were investigated through metabolomic analysis, and mechanisms for their difference were surveyed by comparative transcriptomic and proteomic analysis. Levels of 34 flavonoids were different between ZJ and YK-10 shoots. Among them, 8 and 6 were marker metabolites in ZJ and YK-10, respectively. The differentially expressed genes (DEGs), differentially expressed proteins (DEPs), and different-level metabolites (DLMs) between ZJ and YK-10 were researched, respectively; and interactions including DEG-DLM, DEP-DLM, DEG-DEP, and DEG-DEP-DLM were analyzed; the contents of 18 characteristic flavonoids in tea leaves and expressions of 34 flavonoid metabolic genes were measured to verify the omics results. Integrated above analyses, a proposed model of flavonoids biosynthesis in tea shoots were established. The differential expression of the leucoanthocyanidin reductase (LAR), anthocyanidin synthase (ANS), anthocyanidin reductase (ANR), UDPG-flavonoid glucosyltransferase (UGT) 75L12 and 94P1 at gene level, and the ANS, ANR, and UGT78A15 at protein level, were closely associated with differences in flavonoids between ZJ and YK-10 shoot. Together, this study provides new information on the flavonoid accumulation mechanism in tea plant.

Tea is the most consumed beverage after water, and it is the dried product of tender tea shoots of the tea plant [Camellia sinensis (L.) O. Ktze]. Flavonoids, especially catechins, such as (-)-epigallocatechin-3-gallate (EGCG) and (-)-epigallocatechin (EGC), are important bioactive and flavorful substances in tea (Kim et al. 2014;Reygaert 2018;Davis and Running 2021). Interestingly, in many countries and regions, like China, India, Kenya, and Japan, there are some cultivars have red or purple tender shoots, such as "Zijuan" (ZJ), "Ziyan," and "TRFK 306" (Jiang et al. 2013;Kerio et al. 2013;Joshi et al. 2015;Lai, Li et al. 2016;Maritim, Korir et al. 2021). They are color mutants of the tea plant with green shoot and can be inherited stably through asexual reproduction. The percentage of anthocyanins in the red or purple tender shoots is 1%-3% of dry weight, which is higher than that in green shoots (Saito et al. 2011;Jiang et al. 2013). In addition, the phenolic acids, amino acids, alkaloids and flavonoids like flavan-3-ols, proanthocyanins, flavonol, and flavone glycosides were also differentially abundant metabolites between purple and green tea leaves. The fold change values of flavonoids were obviously higher than that of other compounds . Therefore, tea germplasms with purple shoots can be used for studying chemical compositions and physiological functions of anthocyanins, as well as the mechanism of flavonoid biosynthesis.
Here, we explored the differences in flavonoids between the ZJ purple shoots and Yunkang-10 (YK-10) green shoots using metabolomics and high-performance liquid chromatography (HPLC) analyses. To improve the accuracy and specially explore the flavonoids biosynthesis mechanism, the targeted metabolome was chosen to analyze the flavonoids difference, both the primary and secondary spectrum data were qualitatively analyzed. The differences in expression levels of genes and enzymes involved in flavonoid biosynthesis were investigated using transcriptomics and proteomics analyses. The correlations between flavonoids and biosynthesized genes of enzymes were investigated and further verified by real-time quantitative polymerase chain reaction (RT-qPCR) and HPLC measurement. Our research increases our understanding of flavonoids and their formation mechanism in the tea plant.

Plant materials and samples preparation
All the plant materials were collected from the Pu'er City Institute of Tea Science (N 22 02 0 -24 50 0 , E 99 09 0 -102 19 0 ), Yunnan Province, China, on September 30, 2018. Under the same picking standard (1 bud and 2 leaves), all replicate samples picked from healthy tea plants with similar growth status to ensure highly consistent. One bud and 2 leaves of purple shoots from ZJ and green shoots from YK-10 were collected for metabolomics, transcriptomics, and proteomics analyses, each with 3 biological duplications. The expression levels of 34 flavonoid biosynthetic genes were determined using RT-qPCR, and the contents of 18 tea characteristic flavonoids were measured using by HPLC (Fig. 1). The samples used for HPLC and metabolomics analyses were fixed and dried by microwave, and the samples used for transcriptomics, proteomics and RT-qPCR analyses were immediately frozen in liquid nitrogen and stored at À80 C.

Transcriptomic analysis
Total RNA was extracted using the TRIzol method and evaluated using a NanoDrop 2000 Spectrophotometers (Thermo Fisher Scientific, Wilmington, USA) and Agilent 2100 Bioanalyzer (Agilent, California, USA). A cDNA library was constructed by amplifying full-length cDNA to generate full-length transcriptome through PacBio Isoform sequencing, as described in our previous report (Chen, Shi et al. 2020). Another cDNA library was constructed with amplified cDNA of greater than 300 bp to generate the 2 nd generation transcriptome using Illumina HiSeq2000 sequencing. High-quality isoforms or clean reads were obtained after filtration, and then were mapped to the YK-10 genome (http:// www.plantkingdomgdb.com/tea_tree/) (Xia et al. 2017). The mapped reads were assembled using Cufflinks software and Reference Annotation Based Transcripts method (Trapnell et al. 2010;Roberts et al. 2011). Then, the assembled results of different replicate samples were merged for further analysis.
The gene expression levels were calculated using the Fragments Per Kb per Million reads (FPKM) method, and the differentially expressed genes (DEGs) were filtered using edgeR (https://bioconductor.org/packages/release/bioc/html/edgeR. html) with the criteria False Discovery Rate < 0.05 and jlog 2 [Fold Change (FC)]j > 1. Based on the genes annotation description of reference genome, the DEGs were used for cluster and Gene Ontology (GO) (http://www.geneontology.org/) and Kyoto Encyclopedia of Genes and Genomes (KEGG) significant enrichment analyses (Kanehisa et al. 2008). The new genes (length ! 200 bp and exon number ! 2) were annotated by blasting into non-redundant proteins sequences database in NCBI (http:// www.ncbi.nlm.nih.gov/) and KEGG (E-value < 1 Â 10 À5 ) database.

Proteomic analysis
Proteins were extracted from tea shoots using the fractional precipitation method with organic solvents, and protein concentrations were measured using a Bicinchoninic Acid detection kit (P0012, Biyuntian, Shanghai, China). Extracted proteins were labeled using an iTRAQ label kit (AB SCIEX, Framingham, USA). The labeled peptides were pre-separated using high-pH reversedphase liquid chromatography and analyzed by low-pH nano-HPLC-MS/MS (Orbitrap Fusion Tribrid, Thermo Fisher Scientific, Wilmington, USA) . Then, the MS data were analyzed using the Mascot search engine 2.3.2 (Matrix Science, London, UK). The Mascot database was established for protein identification using the tea plant genome.
According to the GO and KEGG annotations, enrichment analyses of the target proteins were fulfilled under Fisher's exact test. The medians of relative expression levels after all "Unique Peptides" normalization were taken as the protein expression level. The differentially expressed proteins (DEPs) were filtered using the criteria jFCj >1.2 and P <0.05. On the basis of gene/protein expression profiles, correlations among the quantity and the GO and KEGG enrichments between the proteome and transcriptome were analyzed. The 4/9-quadrant figures were used for separating the correlated genes and proteins. To better identify the key genes/proteins, the GO and KEGG annotations of the DEPs and DEGs were further analyzed.
According to metabolites abundance and gene/protein expression levels, the correlations between metabolome and transcriptome/proteome were analyzed. Three models were established to analyze the correlations, the KEGG Pathways, Bidirectional orthogonal projections to latent structures and Pearson's correlation coefficients. Furthermore, the correlations among the metabolome, proteome, and transcriptome were analyzed on the basis of the DLMs, DEPs, and DEGs involved in the flavonoid synthetic pathways.  luteolin (Lu), in ZJ and YK-10 shoots were determined by HPLC (Agilent 1200, USA) with a TSKgel ODS-80TM column (4.6 mm U Â 250 mm, 5 mm, TOSOH, Japan). In total, 8 ml of chromatographic methanol was added to 0.20 g tea-leaf powder and refluxed for 1.5 h. Then, the 10 mL diluted extract was filtered with a 0.45 mm nylon membrane. Each sample was extracted twice, and each extraction was detected 3 times.

HPLC determination of the 18 flavonoids
The set conditions of the HPLC were as follow: mobile phase A was 5% acetonitrile and 0.261% phosphoric acid, phase B was 80% methanol; flow rate was 0.8 ml/min; column temperature was 35 C; injection volume was 2 ll; and the detection wavelengths were 280 nm for catechins, 530 nm for anthocyanins, and 360 nm for flavonols. The elution gradient program was 0-16 min, and B increased from 10% to 45% linearly; 16-22 min, B from 45% to 65%; 25-25.9 min, B from 65% to 100%; 25.9-29.0 min, B maintained 100%; 30 min, B decreased from 100% to 10% linearly and was then maintained at 10% for 6 min.

RT-qPCR analysis of genes expression
The total RNA was extracted using a total RNA Extraction Kit for polysaccharide polyphenolic plants (TIANGEN, Beijing, China) and reverse transcribed to cDNA using a PrimeScript RT reagent Kit with gDNA Eraser (TaKaRa, Dalian, China). Using the TB Green Premix Ex Taq II Kit with Tli RNaseH Plus (TaKaRa), the 20-ll reaction system for RT-qPCR was established as follows: 10 ll TB Green Premix Ex Taq II (2Â), 0.8 ll forward/reverse primers (10 lM), 0.4 ll ROX Reference Dye II (50Â), 2 ll cDNA and 6 ll ddH 2 O.
Using the software Primer 3 Plus, the primers for RT-qPCR were designed (Supplementary Table 1). The reaction program for the RT-qPCR was as follows: 95 C pre-denaturation for 30 s, 40 cycles of 95 C denaturation for 5 s and 60 C extension for 30 s; followed by 60-95 C for a melting curve analysis. The relative expression levels of genes were calculated using the 2 -DDCT method (Livak and Schmittgen 2001).

Statistical analyses
The means and standard deviation represent the average values of 3 determination results, and P <0.05 was considered as significant difference after t-tests among independent samples. The SPSS 22 software (Cronk 2016) (IBM SPSS Statistics, Chicago, USA) and TBtool  were used for statistical analyses, and SIMCA14.1 software was used for PCA and OPLS-DA analysis. The heatmap, network diagram and scatter plot were constructed using Heml, Cytoscape and Excel software, respectively.
Flavonoid biosynthetic gene expression differences between ZJ and YK-10 shoots Using Illumina HiSeq2000 sequencing and data analysis, 3,531 and 3,215 genes were identified as having higher and lower expression levels in ZJ compared with in YK-10, respectively. Additionally, 64, 22, 12, and 3 DEGs were assigned into the KEGG pathways of phenylpropanoid biosynthesis (ko00940), flavonoid biosynthesis (ko00941), anthocyanin biosynthesis (ko00942), and flavone & flavonol biosynthesis (ko00944), respectively, which are related to the biosynthesis of flavonoid.

Flavonoids biosynthetic enzyme expression differences between ZJ and YK-10 shoots
Using a proteomics analysis, 4,186 proteins were identified. In comparison to YK-10, 415 and 383 proteins showed increased and decreased expression levels in ZJ (Fig. 3a). Among them, 25, 3, and 1 DEP were enriched int phenylpropanoid, flavonoid and anthocyanin biosynthesis pathways, respectively, and these pathways were among the 20 top enriched pathways (Fig. 3b). Among the 29 DEPs, 3 PALs, 1 CHI, 1 ANS, 1 ANR, and 1 UGT78A15 were directly involved in the biosynthesis of catechins and anthocyanins (Supplementary Table 5).
To further understand the functions of the DEPs in flavonoid biosynthesis, the correlations between DEPs and DEGs of ZJ and YK-10 were analyzed. The expression abundances of 144 DEPs were associated with the corresponding coding DEG levels, including 130 positively and 14 negatively correlated DEP-DEG pairs (Fig. 3c). Among them, 13 and 3 DEP-DEG pairs were enriched in the phenylpropanoid and flavonoid biosynthetic pathways, respectively. Only 4 DEP-DEG pairs were related to flavonoid biosynthesis. Among them, CHI (CSA008261), ANS (CSA011508), and ANR (CSA011986) showed the same expression trends as their corresponding DEGs, but PAL (CSA016076) showed a different expression trend ( Fig. 3d; Supplementary Table 6a).
Flavonoid biosynthetic (2 DLMs and 4 DEPs) and anthocyanin biosynthetic (6 DLMs and 1 DEP) pathways were both annotated in the DLM and DEP correlation analysis. The expression of UGT (CSA021432) was positively correlated with the levels of 6 anthocyanin 3-O-glucosides. The DEPs expression of CHI (CSA008261, 006623) and ANR (CSA011986) were positively correlated with the flavanol accumulations of (À)-epiafzelechin and afzelechin, whereas the expression of ANS (CSA011508) was negatively correlated with their levels (Fig. 4, d and e; Supplementary Table 6c). Interestingly, the expression of CHI (CSA008261) and ANR (CSA011986) showed positive correlation, while expression of ANS (CSA011508) showed negative correlation with the abundances of (À)-epiafzelechin and afzelechin, respectively.

Integrated analysis of transcriptomic, proteomic and metabolomic data
To gain a deeper understanding of the flavonoid metabolic mechanisms, the transcriptome-proteome-metabolome correlations were analyzed on the bases of DEGs-DEPs-DLMs involved in the flavonoid pathway ( Fig. 5; Supplementary Table 7). In the flavonoid biosynthetic pathway, the expression of most DEGs showed opposite correlations with the expression of ANS (CSA011508) and accumulations of (À)-epiafzelechin and afzelechin. The accumulations of (À)-epiafzelechin and afzelechin presented negative correlations with the expression of ANS (CSA011508) but had no direct correlation with that of ANR (CSA011986). The DEGs expression of DFR (CSA035727) and FLS (CSA008358) showed negative associations with the DEP level of ANS, while they showed positive association with DEP level of ANR and DLMs accumulations of (À)-epiafzelechin and afzelechin. However, the expression of DFR (CSA003949) and HCT (CSA011696) positively correlated with level of ANS but negatively correlated with level of ANR, (À)-epiafzelechin, and afzelechin.
In the anthocyanin biosynthetic pathway, the expression of 8 UGT genes (UGT94P1 and UGT75L12/13) correlated with the accumulation of 5 anthocyanin 3-O-glucosides. Only the expression of 4 DEGs (2 UGT94P1 and 2 UGT75L12) directly correlated with the levels of DEP UGT78A15 (CSA021432) and anthocyanin 3-O-glucosides at the same time. The levels of UGT94P1 (CSA005965) and UGT75L12 (CSA008693) were negatively correlated with the levels of UGT78A15 and anthocyanin 3-O-glucosides, but those of the UGT94P1 (CSA007394) and UGT75L12 (CSA036671) showed opposite correlations. In addition, the expression of UGT94P1 (CSA026000) was positively correlated with the level of the DEP CHI (CSA008261, 006623), but it showed no correlation with the abundances of anthocyanin 3-O-glucosides. In the flavone and flavonol biosynthetic pathway, the expression of the DEGs F3 0 5 0 H (CSA031792) and CCoAOMT1 (CSA028429) presented negative and positive correlations, respectively, with the flavone glycosides levels of rhoifolin, quercitrin, and vitexin 2 00 -O-b-L-rhamnoside. There was no DEP level in this pathway was associated with DEG and DLM levels. However, the DEG expression of C12RT1 (CSA015614) showed correlations with the DEP and DLM levels in other pathways. Generally, the connections among the DEG, DEP, and DLM levels were not limited to the same pathway.

Verification of the correlations between gene expression and flavonoid contents
To verify the UPLC-MS/MS and RNA-seq results, 18 flavonoids and 34 biosynthetic genes in ZJ and YK-10 shoots were detected using HPLC and RT-qPCR, respectively (Fig. 6, a and b). Among them, 14 flavonoids and 30 genes had been determined using omics technology. In ZJ/YK-10, the contents of 9 flavonoids showed similar trends in the metabolomics analysis and HPLC measurements. Additionally, the expression patterns of 20 genes showed similar trends in the RNA-seq analysis and RT-qPCR. Additionally, the differences in flavonoid and genes levels were mostly not significant, as indicated by the absolute log 2 FC values being less than one in the omics data (Fig. 6, c and d). The HPLC results showed that the contents of Cy and 6 anthocyanins in ZJ were significantly greater than in YK-10, but the contents of 8 catechins showed no significant differences. Interestingly, the expression levels of most genes were higher in ZJ than in YK-10. Correlation analyses showed that the most significant correlations were positive, and the expression of most early genes played positive roles in catechin accumulation. The expression of ANS, which had a 17 times greater level in ZJ compared with in YK-10, positively correlated with the Cy and total anthocyanin accumulations. The expression of ANR, UGT75L121/ 123, SHT, and CCoAOMT2 also promoted anthocyanin biosynthesis. Additionally, the expression of DFR and the contents of EGCG showed negative correlations. The expression of ANS and the content of GC, as well as the expression of CHS1, CHI, and FLS3 and the level of Cy, showed negative correlations (Fig. 6e).

Discussion
Using the 2 nd transcriptome, 101 DEGs of ZJ/YK-10 encoding flavonoid biosynthetic enzymes were identified. The expression levels of 2 DFRs, 1 ANS (CSA011508), and 8 UGTs in ZJ were significantly higher than in YK-10. However, the expression of ANR (CSA011986) presented the opposite profile. Furthermore, the expression of ANS (CSA011508) and ANR (CSA011986) showed consistent trends with their encoding genes, and negatively and positively correlated with the DLMs of flavanols. Therefore, we speculate that the DEGs with high FPKM values, especially ANS (CSA011508) and ANR (CSA011986), play key roles in the accumulation and distribution of catechins and anthocyanins in ZJ. The high expression of ANS (CSA011508) and low expression of ANR (CSA011986) stimulate the accumulation of anthocyanins by reducing the conversion to catechins (Seitz et al. 2003;Mei et al. 2020).
In the DLM-DEG and DLM-DEP integration analyses, there were more correlated DLM-DEG pairs than DLM-DEP pairs involved in the flavonoids pathways. There were no interacting DLM-DEP pairs in the flavone and flavonol biosynthetic pathway, which may indicate that flavonoid accumulation is regulated mainly at the gene level but not at the protein level. The contents of 6 anthocyanidin 3-O-glucosides, 3 flavone glycosides and 2 flavanols were correlated with DEG/DEP expression. The 2 flavanols, (À)-epiafzelechin and afzelechin, are dimer and trimer units of propelargonidin, which is a proanthocyanidin (Omar et al. 2011). The expression of LAR (CSA014943, XLOC_016774) showed positive correlations with the contents of (À)-epiafzelechin and afzelechin. In our research, the lower level of LAR in ZJ was consistent with the reported results (Hossain et al. 2018;Mei et al. 2020), but the displayed correlations with (À)-epiafzelechin and afzelechin were noted for the first time. Therefore, we speculated that the lower expression of LAR resulted in the lower accumulations of proanthocyanidins in ZJ than in YK-10.
These special DEGs [ANS (CSA011508), ANR (CSA011986), LAR (XLOC_016774), UGT75L12 (CSA036671), UGT94P1 (CSA007394)] and DEPs [ANS (CSA011508), ANR (CSA011986), UGT78A15 (CSA021432)] may play indispensable roles in the molecular mechanisms associated with the purple color in ZJ shoots. The competition for reaction substrates may responsible for the opposite expression profiles between ANS and LAR, ANR genes, and ANS and ANR enzymes (Mei et al. 2020). Under the collective effects of these DEGs and DEPs, the ZJ shoots present as purple with the decrease in proanthocyanidins and increase in anthocyanins. The anthocyanins were further converted into anthocyanins glucosides, which are more stable structures   (Fig. 7).
In ZJ shoots, the total catechins, and especially anthocyanin, concentrations were higher than in YK-10 shoots, which corresponds to the quality characteristics of ZJ as described previously (Yang et al. 2009). Among the catechins, the content of EGCG analyzed by HPLC in ZJ was less than in YK-10, which was opposite of the UPLC-MS/MS data. This may be because the heating during the extraction process promotes the isomerization and reduction of EGCG . The higher concentration of GCG in our experiment, which is a transisomer of EGCG, was also probably related to the extraction method.
The RT-qPCR results confirmed that the higher expression levels of most biosynthetic genes were important reasons for higher catechin and anthocyanin levels in ZJ/YK-10(Aza-Gonzá lez et al. 2013). The roles of early genes in anthocyanin accumulation have been shown in other purple plants , and the correlations between Dp and Pg abundance and early gene expression levels were almost positive in our study. The high gene expression level and catalytic activity of UGT promotes flavonoid accumulation (He et al. 2018), and 5 UGT isoforms with higher values in ZJ were verified by HPLC. The differential expression levels of ANS (CSA011508), LAR (CSA014943), and UGT (CSA036672) identified by RT-qPCR were consistent with the transcriptome predictions. Furthermore, the ANS level correlated negatively with GC content but positively with Cy, Dp, Pg and total 6 anthocyanins content in our study. This was consistent with ANS expression promoting the anthocyanin accumulation in Solanaceous vegetables . This indicates that the proposed model of purple-color formation in tea shoots is credible to some extent.
In general, we attempted to understand more clearly the molecular mechanisms behind the flavonoid differences that exist in tea shoots showing color variations using omics integrated analyses. The key ANS, ANR, LAR, UGT75L12, and UGT94P1 genes and the key ANS, ANR, UGT78A15 enzymes were predicted to play essential roles in anthocyanin accumulation. This demonstrated that the anthocyanin accumulation was positively associated with ANS and UGT75L12 expression as determined by HPLC and RT-qPCR combined analyses. However, the results of the RT-qPCR and HPLC experiments were not totally consistent with those of the omics analyses. Furthermore, the expression profiles of flavonoid biosynthetic genes are not only influenced by many environment factors, such as light (Li et al. 2020) and abiotic Fig. 6. The HPLC analyses of the contents of 18 flavonoids (a, c), RT-qPCR analyses of 34 related biosynthetic genes (b, d) and correlation analyses between them (e). Note: a: the numbers represent contents of compounds (mg/g); * represents significant differences (log 2 FC > 1 or log 2 FC < À1).
stresses (Gai et al. 2020), they are also regulated at the transcriptional and translational levels (Wu et al. 2020). The regulator genes of transcription factors like MYB, bHLH, WD40 and complex of them involving in the flavonoid biosynthesis, and their regulation mechanism through target genes are complex but worth to study. It is the most important work in progress for our team to explore and verify the regulator genes using transgenic and other biotechnologies.

Data availability
All data generated or analyzed during this study are included in this published article and its Supplementary Materials. The raw data of transcriptome and proteome have been respectively deposited to Sequence Read Archive database (https://submit.ncbi. nlm.nih.gov/about/sra/) (PRJNA524304) and iProX database (https://www.iprox.org/) (PXD036754). The qualitative and quantitative analysis result of all flavonoids identified by target metabolome was uploaded as Supplementary Table 8.
Supplemental material is available at G3 online.